Chapter Outline

  • Standardizing observations for comparisons
  • The normal model
  • 68-95-99.7 rule for the normal model
  • Finding percentiles and areas for given observations from a normal model
  • Finding observations for given percentiles or areas from a normal model

Standardizing observations

Comparing two variables

  • ACT score
  • SAT score
  • Different scale
  • Measuring similar values
  • How do you compare them?

Example 1

  • The average score on the ACT English exam is 21.0, with a standard deviation of 4.0. The average score on the SAT verbal exam is 520 with a standard deviation of 100.
    • Albert scores a 27 on the ACT English exam
    • Alfred scores a 770 on the SAT Verbal exam
  • Who has the better score? Albert or Alfred?

Standardizing observations

  • \(y =\) observation of quantitative variable
  • How does the value of \(y\) relate to the mean value?
  • How does the value of \(y\) for this quantitative variable relate to another observation of a different quantitative variable?

Standardizing variables

\[z = \frac{y - \bar{y}}{s}\]

  • \(z\) has no units (just a number)
  • Puts observations on same scale
    • Mean (center) at 0
    • Standard deviation (spread) of 1
  • Does not change overall shape of the distribution
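
The properties listed above can be checked numerically. The sketch below is a minimal illustration, assuming the sample mean and sample standard deviation play the roles of \(\bar{y}\) and \(s\); the helper name `standardize` and the `scores` data are hypothetical and not taken from the slides.

```python
from statistics import mean, stdev


def standardize(values):
    """Return the z-score of each value: (y - y_bar) / s."""
    y_bar = mean(values)
    s = stdev(values)  # sample standard deviation (n - 1 in the denominator)
    return [(y - y_bar) / s for y in values]


scores = [15, 18, 21, 24, 27]  # hypothetical data, for illustration only
z_scores = standardize(scores)

# The standardized values have no units, a mean of 0 and a standard deviation of 1
# (up to floating-point rounding).
print([round(z, 2) for z in z_scores])
print(round(mean(z_scores), 10), round(stdev(z_scores), 10))
```

Because every value is shifted and rescaled by the same amounts, the ordering and relative spacing of the observations are unchanged, which is why the shape of the distribution stays the same.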

Standardizing variables

  • \(z =\) # of standard deviations observation is away from mean
    • Negative \(z\): observation is below mean
      • Ex: \(z=-1.5\Rightarrow\) observation is 1.5 standard deviations below mean
    • Positive \(z\): observation is above mean
      • Ex: \(z=0.5\Rightarrow\) observation is 0.5 standard deviations above mean

Recall: Example 1

  • The average score on the ACT English exam is 21.0, with a standard deviation of 4.0. The average score on the SAT verbal exam is 520 with a standard deviation of 100.
    • Albert scores a 27 on the ACT English exam
    • Alfred scores a 770 on the SAT Verbal exam
  • How do we answer who has the better score?
    • Compute z-scores and compare

Example 1

  • Albert \[z = \frac{27-21}{4} = 1.5\]
  • Alfred \[z = \frac{770 - 520}{100} = 2.5\]
  • Who has the better score?
    • Alfred, because he has the larger z-score
    • Albert's score is 1.5 standard deviations above the mean, while Alfred's score is 2.5 standard deviations above the mean
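
The two z-scores in Example 1 can be reproduced with a few lines of code. This is a small sketch; the function name `z_score` is a hypothetical helper, and the means and standard deviations are the ones stated in the example.

```python
def z_score(y, mean, sd):
    """Number of standard deviations that y lies above (+) or below (-) the mean."""
    return (y - mean) / sd


albert = z_score(27, mean=21.0, sd=4.0)   # ACT English
alfred = z_score(770, mean=520, sd=100)   # SAT Verbal

print(albert, alfred)  # 1.5 2.5

# The larger z-score is the better score relative to its own exam.
better = "Albert" if albert > alfred else "Alfred"
print(better)  # Alfred
```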

Example 2

  • The average score on the ACT Math exam is 20.7 with a standard deviation of 4.1. The average score on the SAT Math exam is 510 with a standard deviation of 100
    • Alberta scores a 15 on the ACT Math exam
    • Fredda scores a 340 on the SAT Math exam
  • Who has the better score? Alberta or Fredda?

Example 2

  • Alberta \[z = \frac{15-20.7}{4.1} = -1.39\]
  • Fredda \[z = \frac{340 - 510}{100} = -1.70\]
  • Who has the better score?
    • Alberta because she has a larger z-score
    • Alberta is only 1.39 standard deviations below the mean while Fredda is 1.70 standard deviations below the mean
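
The same comparison works when both z-scores are negative: the score closer to zero (the larger z) is the better one. A brief sketch, again using a hypothetical `z_score` helper and the values given in Example 2:

```python
def z_score(y, mean, sd):
    """Number of standard deviations that y lies above (+) or below (-) the mean."""
    return (y - mean) / sd


alberta = z_score(15, mean=20.7, sd=4.1)   # ACT Math
fredda = z_score(340, mean=510, sd=100)    # SAT Math

print(f"{alberta:.2f} {fredda:.2f}")  # -1.39 -1.70

# Both are below average; Alberta is less far below, so her z-score is larger.
better = "Alberta" if alberta > fredda else "Fredda"
print(better)  # Alberta
```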

Distributions and standardizing

  • Standardizing
    • Allows you to make comparisons of observations between different variables
    • Without the distribution information, you still don't know anything about the percentile value of your observation
    • This percentile value depends on the distribution

Models for data distributions

  • Different models capture different aspects of a distribution
    • Shape
    • Center
    • Variability

The Normal Model

Normal Model

  • Shape
    • Unimodal
    • Symmetric
    • Bell-shaped
  • Determined by two parameters
    • Mean \((\mu)\)
    • Standard deviation \((\sigma)\)

Connection to data

  • No data distribution follows a normal model exactly
  • Many data distributions are very close though
  • How do you know?
    • Histogram
    • Normal Quantile Plot (will learn in lab)

A closer look at the normal model

  • Mean \(\mu\)
    • Locates the center of the distribution
    • Splits the curve in half
  • Standard deviation \(\sigma\)
    • Controls the variability of curve
    • Ruler of distribution
  • Write as \(N(\mu, \sigma)\)

Example

  • Height of men is normally distributed
    • Mean \(\mu = 70\) inches
    • Standard deviation \(\sigma = 3\) inches
  • So, the height of men is distributed \(N(70,3)\)
  • What percent of men have heights less than 70 inches?
    • 50%
  • What percent of men have heights greater than 70 inches?
    • 50%
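
The 50% answers follow from the symmetry of the normal curve about its mean. As a quick check, here is a sketch using `NormalDist` from Python's standard `statistics` module, assuming heights follow \(N(70, 3)\) exactly:

```python
from statistics import NormalDist

heights = NormalDist(mu=70, sigma=3)  # N(70, 3): mean 70 inches, sd 3 inches

below_mean = heights.cdf(70)      # proportion of heights less than 70 inches
above_mean = 1 - below_mean       # proportion of heights greater than 70 inches

print(below_mean, above_mean)  # 0.5 0.5
```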

68-95-99.7 rule

  • Approximately 68% of observations are within 1 \(\sigma\) of the mean \(\mu\)
  • If \(\mu = 0\) and \(\sigma = 1\), approximately 68% of the observations are between -1 and 1

68-95-99.7 rule

  • Approximately 95% of observations are within \(2\sigma\) of the mean \(\mu\)
  • If \(\mu = 0\) and \(\sigma = 1\), approximately 95% of the observations are between \(-2\) and \(2\)

68-95-99.7 rule

  • Approximately 99.7% of observations are within \(3\sigma\) of the mean \(\mu\)
  • If \(\mu = 0\) and \(\sigma = 1\), approximately 99.7% of the observations are between \(-3\) and \(3\)
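
The three percentages in the rule are rounded values of areas under the standard normal curve. The sketch below recomputes them with `statistics.NormalDist`; the helper name `within` is hypothetical.

```python
from statistics import NormalDist

Z = NormalDist()  # standard normal model: mean 0, standard deviation 1


def within(k):
    """Proportion of a normal model lying within k standard deviations of the mean."""
    return Z.cdf(k) - Z.cdf(-k)


for k in (1, 2, 3):
    print(f"within {k} sd: {within(k):.4f}")
# within 1 sd: 0.6827
# within 2 sd: 0.9545
# within 3 sd: 0.9973
```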

Example - Height of men

  • Height of men \(\sim N(70, 3)\)
  • 68-95-99.7 rule
    • 68% of men will have heights between which two values?
      • \(\mu-\sigma = 70 - 3 = 67\) inches
      • \(\mu+\sigma = 70 + 3 = 73\) inches

68% rule implies
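
One way to read this heading: because the normal curve is symmetric, the single 68% figure fixes several other areas as well. The sketch below works them out for the height example, assuming heights follow \(N(70, 3)\); the variable names are illustrative only.

```python
mu, sigma = 70, 3        # height of men, N(70, 3)
within_one_sd = 68.0     # percent within one sd of the mean, from the rule

each_side = within_one_sd / 2    # mean to one sd, on either side
outside = 100 - within_one_sd    # more than one sd from the mean
each_tail = outside / 2          # split evenly between the two tails
below_upper = 50 + each_side     # everything below mean + one sd

print(f"{each_side:.0f}% between {mu - sigma} and {mu} inches")   # 34% between 67 and 70 inches
print(f"{each_side:.0f}% between {mu} and {mu + sigma} inches")   # 34% between 70 and 73 inches
print(f"{each_tail:.0f}% below {mu - sigma} inches")              # 16% below 67 inches
print(f"{each_tail:.0f}% above {mu + sigma} inches")              # 16% above 73 inches
print(f"{below_upper:.0f}% below {mu + sigma} inches")            # 84% below 73 inches
```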

Example

  • Height of men \(\sim N(70,3)\)
  • 68-95-99.7 rule
    • 95% of men will have heights between which two values?
      • \(\mu - 2\sigma = 70 - 2 \cdot 3 = 64\) inches
      • \(\mu + 2\sigma = 70 + 2 \cdot 3 = 76\) inches

Example

  • Height of men \(\sim N(70,3)\)
  • 68-95-99.7 rule
    • 99.7% of men will have heights between which two values?
      • \(\mu - 3\sigma = 70 - 3 \cdot 3 = 61\) inches
      • \(\mu + 3\sigma = 70 + 3 \cdot 3 = 79\) inches
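
All three height intervals come from the same arithmetic, \(\mu \pm k\sigma\) with \(k = 1, 2, 3\). A short sketch, where `interval` is a hypothetical helper name:

```python
def interval(mu, sigma, k):
    """Endpoints lying k standard deviations below and above the mean."""
    return (mu - k * sigma, mu + k * sigma)


# Height of men ~ N(70, 3)
for k, pct in ((1, 68), (2, 95), (3, 99.7)):
    low, high = interval(70, 3, k)
    print(f"about {pct}% of heights fall between {low} and {high} inches")
# about 68% of heights fall between 67 and 73 inches
# about 95% of heights fall between 64 and 76 inches
# about 99.7% of heights fall between 61 and 79 inches
```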

Beyond the 68-95-99.7 rule

  • Z-table
    • Connection between normal values and percentiles for standard normal model
  • Standard normal model
    • Denoted by \(Z\)
    • Mean \(\mu = 0\)
    • Standard deviation \(\sigma = 1\)

Finding percentiles and areas for given observations from a normal model

Z Table

  • Table gives proportion of curve below a particular \(z\) score (the percentile for the value \(z\))
    • \(z\) values range from \(-3.90\) to \(3.90\) on the table
    • Row - ones and tenths place for \(z\)
    • Column - hundredths place for \(z\)
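
A table entry is the area under the standard normal curve to the left of \(z\), rounded to four decimal places. The sketch below mimics a lookup: it splits \(z\) into the row and column labels described above and computes the entry with `statistics.NormalDist`. The function name `table_lookup` and the label formatting are assumptions for illustration; a printed table may label its rows and columns slightly differently.

```python
from statistics import NormalDist

Z = NormalDist()  # standard normal model


def table_lookup(z):
    """Return (row label, column label, proportion below z) for a two-decimal z."""
    hundredths = round(abs(z) * 100)       # z expressed in whole hundredths
    tenths, col = divmod(hundredths, 10)   # row digits and column digit
    sign = "-" if z < 0 else ""
    row_label = f"{sign}{tenths / 10:.1f}"  # ones and tenths place
    col_label = f"0.0{col}"                 # hundredths place
    return row_label, col_label, round(Z.cdf(z), 4)


print(table_lookup(1.98))   # ('1.9', '0.08', 0.9761)
print(table_lookup(-1.50))  # ('-1.5', '0.00', 0.0668)
```

So \(z = 1.98\) sits at about the 97.6th percentile, and \(z = -1.50\) at about the 6.7th.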

Percentile for z = -1.50

Percentile for z = 1.98

Proportion greater than z = -1.65

Proportion greater than z = 0.73

Proportion between z = 0.5 and z = 1.4

Proportion between z = -2.3 and z = -0.05
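
Each of these questions reduces to one or two "proportion below" values: a percentile is the proportion below \(z\), a "greater than" proportion is 1 minus the proportion below, and a "between" proportion is the difference of two proportions below. A sketch of all six, with hypothetical helper names `below`, `above` and `between`:

```python
from statistics import NormalDist

Z = NormalDist()  # standard normal model


def below(z):
    """Proportion of the curve below z (the percentile of z)."""
    return Z.cdf(z)


def above(z):
    """Proportion of the curve above z."""
    return 1 - Z.cdf(z)


def between(a, b):
    """Proportion of the curve between a and b, with a < b."""
    return Z.cdf(b) - Z.cdf(a)


print(f"below -1.50:            {below(-1.50):.4f}")
print(f"below 1.98:             {below(1.98):.4f}")
print(f"above -1.65:            {above(-1.65):.4f}")
print(f"above 0.73:             {above(0.73):.4f}")
print(f"between 0.5 and 1.4:    {between(0.5, 1.4):.4f}")
print(f"between -2.3 and -0.05: {between(-2.3, -0.05):.4f}")
```

Results computed this way can differ in the last digit from subtracting two rounded table entries.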

Finding z from a given percentile
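
Going from a percentile back to \(z\) is the table lookup in reverse: find the proportion in the body of the table, then read off the row and column. In code this is the inverse of the cumulative distribution function. The sketch below uses `NormalDist.inv_cdf`; the percentiles chosen are hypothetical examples, not values from the slides.

```python
from statistics import NormalDist

Z = NormalDist()  # standard normal model


def z_for_percentile(p):
    """z value with proportion p of the curve below it (0 < p < 1)."""
    return Z.inv_cdf(p)


# Hypothetical percentiles, for illustration only
for p in (0.10, 0.50, 0.90):
    print(f"{p:.0%} percentile: z = {z_for_percentile(p):.2f}")
```

By symmetry, the 10th and 90th percentiles are the same distance from 0 on opposite sides, and the 50th percentile is \(z = 0\).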

Finding z from a given area
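
When the given area is not a "below" area, it first has to be converted into one. The sketch below covers two common cases under that assumption: an area in the upper tail, and a central area split evenly around 0. The helper names and the example areas are hypothetical.

```python
from statistics import NormalDist

Z = NormalDist()  # standard normal model


def z_for_upper_area(area):
    """z with the given proportion of the curve above it."""
    return Z.inv_cdf(1 - area)  # area above z  =>  1 - area below z


def z_for_middle_area(area):
    """Positive z such that the given proportion lies between -z and z."""
    tail = (1 - area) / 2       # leftover area, split evenly between two tails
    return Z.inv_cdf(1 - tail)


# Hypothetical areas, for illustration only
print(round(z_for_upper_area(0.05), 2))   # z with 5% of the curve above it
print(round(z_for_middle_area(0.95), 2))  # z with 95% of the curve between -z and z
```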

Standardizing

  • Changes any normal model to a standard normal model \[z = \frac{y - \mu}{\sigma}\]
  • \(z =\) # of standard deviations away from mean \(\mu\)
    • Negative \(z\Rightarrow\) number is below mean
    • Positive \(z\Rightarrow\) number is above mean

Example

  • The height of men follows a normal distribution with mean 70 inches and standard deviation 3 inches
    • Standardize \(y=68\) \[z = \frac{68-70}{3} = -0.67\]
    • Standardize \(y=74\) \[z = \frac{74-70}{3} = 1.33\]
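
Once a height is standardized, its percentile can be read from the standard normal model. The sketch below repeats the two standardizations and, as an added illustration that is not part of the example itself, looks up the proportion below each height. It assumes heights follow \(N(70, 3)\) exactly.

```python
from statistics import NormalDist

heights = NormalDist(mu=70, sigma=3)  # height of men, N(70, 3)
Z = NormalDist()                      # standard normal model

results = {}
for y in (68, 74):
    z = (y - heights.mean) / heights.stdev  # z = (y - mu) / sigma
    results[y] = (z, Z.cdf(z))              # z-score and proportion below y
    print(f"y = {y}: z = {z:.2f}, proportion below = {Z.cdf(z):.4f}")
```

Standardizing first and using the standard normal model gives the same proportion as working with \(N(70, 3)\) directly, which is why a single Z-table is enough for every normal model.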